Location tracking from natural speech

ABSTRACT

A headset computer (HSC) device accepts a user's voiced indication of the location of the device. The user may implicitly or explicitly indicate by voice input his location, and hence the location of the HSC device. A voice driven location module is coupled to the voice recognition engine, a map database and the GPS of the HSC device. Based on user-voiced indications of 3D space location, the voice driven location module determines the device location and resets the 3D space location accordingly.

RELATED APPLICATION

This application claims the benefit of U.S. Provisional Application No. 61/921,307, filed on Dec. 27, 2013. The entire teachings of the above application(s) are incorporated herein by reference.

BACKGROUND OF THE INVENTION

Mobile computing devices, such as a laptop or notebook PC, a smartphone, and tablet computing device, are now common tools used for producing, analyzing, communicating, and consuming data in both business and personal life. Consumers continue to embrace a mobile digital lifestyle as the ease of access to digital information increases with high speed wireless communications technologies becoming ubiquitous. Popular uses of mobile computing devices include displaying large amounts of high-resolution computer graphics information and video content, often wirelessly streamed to the device. While these devices typically include a display screen, the preferred visual experience of a high resolution, large format display cannot be easily replicated in such mobile devices because the physical size of such devices is limited to promote mobility. Another drawback of the aforementioned device types is that the user interface is hands-dependent, typically requiring a user to enter data or make selections using a keyboard (physical or virtual) or touch-screen display. As a result, consumers are now seeking a hands-free, high quality, portable, color display solution to augment or replace their hands-dependent mobile devices.

Location awareness is increasingly important in many aspects of personal computing, especially those involving mobile devices. The standard way of obtaining location is to rely on GPS chipsets built into the device itself. However, this only works reliably when the device has a clear line-of-sight view of the GPS satellites overhead. When operating indoors this is rarely the case, and GPS often fails to return an accurate position when within four walls. Several alternative indoor positioning methods exist, including inertial navigation and wireless beaconing. Inertial navigation uses accelerometers to track movement with respect to an initial position. Wireless beaconing uses fixed location wireless radio antennas (radio beacons) attached to structures, along with various forms of radio triangulation, to calculate exact or approximate mobile user position. Various wireless protocols may be used, for example WiFi, Cellular, and Bluetooth.

All of these indoor methods have limitations. Wireless beaconing requires extensive infrastructure to be set up and maintained.

The accelerometers used for inertial navigation have accuracy limitations, so that position calculations based on them suffer from “drift.” The longer the time period over which inertial navigation (also referred to as dead reckoning navigation) is used, the more total “drift” is accumulated, which corresponds to error in the user's estimated location or position.

The state of the art approach to indoor navigation currently combines aspects of both wireless beaconing and accelerometer tracking. Such a positioning device obtains an accurate fix of its indoor location by way of a short range wireless beacon. Once the fix is obtained, accelerometers try to keep the location estimate up to date as the device moves out of range of distributed, non-overlapping beacons.

Accelerometers may have a drift of about 5-10%. That means after walking around 100 feet following the last fix from a beacon, it is expected that the device's position estimate may be in error by +/−5 to 10 feet.
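By way of illustration only, the drift arithmetic above can be sketched as follows. The function name and the assumption that error grows linearly with distance travelled since the last fix are illustrative and are not taken from any particular embodiment.

```python
def drift_error_bounds(distance_ft, drift_low=0.05, drift_high=0.10):
    """Return the (low, high) expected position error, in feet, after
    travelling distance_ft since the last absolute fix.

    Assumes (for illustration) that error grows linearly with distance,
    using the approximate 5-10% drift figures quoted in the text.
    """
    if distance_ft < 0:
        raise ValueError("distance must be non-negative")
    return (distance_ft * drift_low, distance_ft * drift_high)


# After walking about 100 feet from the last beacon fix:
low, high = drift_error_bounds(100.0)
print(f"expected error: +/-{low:.0f} to {high:.0f} feet")
```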

SUMMARY OF THE INVENTION

Recently developed micro-displays can provide large-format, high-resolution color pictures and streaming video in a very small form factor. One application for such displays can be integrated into a wireless headset computer worn on the head of the user with a display within the field of view of the user, similar in format to eyeglasses, audio headset or video eyewear.

A “wireless computing headset” device, also referred to herein as a headset computer (HSC) or head mounted display (HMD), includes one or more small, high resolution micro-displays and associated optics to magnify the image. The high resolution micro-displays can provide super video graphics array (SVGA) (800×600) resolution or extended graphic arrays (XGA) (1024×768) resolution, or higher resolutions known in the art.

A wireless computing headset contains one or more wireless computing and communication interfaces, enabling data and streaming video capability, and provides greater convenience and mobility than hands-dependent devices.

For more information concerning such devices, see co-pending patent applications entitled “Mobile Wireless Display Software Platform for Controlling Other Systems and Devices,” U.S. application Ser. No. 12/348,648 filed Jan. 5, 2009, “Handheld Wireless Display Devices Having High Resolution Display Suitable For Use as a Mobile Internet Device,” PCT International Application No. PCT/US09/38601 filed Mar. 27, 2009, and “Improved Headset Computer,” U.S. Application No. 61/638,419 filed Apr. 25, 2012, each of which is incorporated herein by reference in its entirety.

As used herein, “HSC” headset computers, “HMD” head mounted display device, and “wireless computing headset” device may be used interchangeably.

Embodiments of the present invention concern using speech utterances to enhance and/or replace the wireless beaconing. In addition to all other forms of inertial navigation and dead reckoning using radio beacons and accelerometers to estimate a mobile user's location or position and to periodically reset the location of the tracking device, one or more spoken words can be used to accurately position or locate a mobile user against a known floor plan, schematic or map. This would be especially true of head-worn devices already driven primarily by voice, which are designed to listen for and understand speech commands.

In one aspect, the invention is a headset computer device including a microdisplay driven by a processor, a microphone coupled to provide user-voiced input to the processor, and a voice location module. The voice location module is executed by the processor, and is configured to establish a location of the device based on the user-voiced input.

In one embodiment, the processor resets the established location based on the user-voiced input. In another embodiment, the voice location module is further configured to establish the location of the device within a three dimensional coordinate system. In another embodiment, the user-voiced input is in response to a solicitation from the headset computer device. Such a solicitation may include a message presented on the microdisplay, an audible message, or other such message communicated to the user.

In another embodiment, the headset computer device is further configured to extract the user-voiced input from unsolicited utterances. The unsolicited utterances may include utterances spoken by the user during the normal course of his activities, or they may be specifically submitted by the user as, for example, a user's intentional attempt to provide location information to the headset computer device. In one embodiment, the user-voiced input includes information describing proximity to an object, landmark or building feature (e.g., visitor alcove or secretary station).

In another embodiment, the headset computer device is further configured to cross-reference the information describing proximity to an object with a reference plan of known objects. In one embodiment, the headset computer device is further configured to identify a match between the information describing proximity to an object and a known object described in the reference plan.

In one embodiment, the headset computer is further configured to use inertial navigation to determine a subsequent location of the device, after the voice location module establishes the location of the device.

In another aspect, the invention is a method of enhancing a location system, including recognizing a user utterance as a description of an object disposed within a region, identifying the object within a map of the region, extracting locational coordinates associated with the object from the map of the region, and establishing a location of the device based on the locational coordinates.

In one embodiment, the method further includes resetting the established location based on the user-voiced input. In another embodiment, the method further includes establishing the location of the device within a three dimensional coordinate system.

In another embodiment, the method further includes cross-referencing the information describing proximity to an object with a reference plan of known objects. In one embodiment, the method further includes identifying a match between the information describing proximity to an object and a known object described in the reference plan.

Another embodiment further includes using inertial navigation to determine a subsequent location of the device, after the voice location module establishes the location of the device.

In another aspect, the invention includes a non-transitory computer-readable medium for recognizing speech, the non-transitory computer-readable medium comprising computer software instructions stored thereon. The computer software instructions, when executed by at least one processor, cause a computer system to recognize a user utterance as a description of an object disposed within a region, identify the object within a map of the region, extract locational coordinates associated with the object from the map of the region, and establish a location of the device based on the locational coordinates.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing will be apparent from the following more particular description of example embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments of the present invention.

FIGS. 1A-1B are schematic illustrations of a headset computer cooperating with a host computer (e.g., Smart Phone, laptop, etc.) according to principles of the present invention.

FIG. 2 is a block diagram of flow of data and control in the embodiment of FIGS. 1A-1B.

FIG. 3 is a block diagram of an automatic speech recognition (ASR) subsystem in embodiments.

FIG. 4 illustrates an embodiment of a method of enhancing a location system according to the invention.

DETAILED DESCRIPTION OF THE INVENTION

A description of example embodiments of the invention follows.

The teachings of all patents, published applications and references cited herein are incorporated by reference in their entirety.

FIGS. 1A and 1B show an example embodiment of a wireless computing headset device 100 (also referred to herein as a headset computer (HSC) or head mounted display (HMD)) that incorporates a high-resolution (VGA or better) micro-display element 1010, and other features described below.

HSC 100 can include audio input and/or output devices, including one or more microphones, input and output speakers, geo-positional sensors (GPS), three to nine axis degrees of freedom orientation sensors, atmospheric sensors, health condition sensors, digital compass, pressure sensors, environmental sensors, energy sensors, acceleration sensors, position, attitude, motion, velocity and/or optical sensors, cameras (visible light, infrared, etc.), multiple wireless radios, auxiliary lighting, rangefinders, or the like and/or an array of sensors embedded and/or integrated into the headset and/or attached to the device via one or more peripheral ports 1020 (FIG. 1B).

Typically located within the housing of headset computing device 100 are various electronic circuits including a microcomputer (single or multicore processors), one or more wired and/or wireless communications interfaces, memory or storage devices, various sensors and a peripheral mount or mounts, such as a “hot shoe.”

Example embodiments of the HSC 100 can receive user input through sensing voice commands, head movements 110, 111, 112 and hand gestures 113, or any combination thereof. A microphone (or microphones) operatively coupled to or integrated into the HSC 100 can be used to capture speech commands, which are then digitized and processed using automatic speech recognition techniques. Gyroscopes, accelerometers, and other micro-electromechanical system sensors can be integrated into the HSC 100 and used to track the user's head movements 110, 111, 112 to provide user input commands. Cameras or motion tracking sensors can be used to monitor a user's hand gestures 113 for user input commands. Such a user interface may overcome the disadvantages of hands-dependent formats inherent in other mobile devices.

The HSC 100 can be used in various ways. It can be used as a peripheral display for displaying video signals received and processed by a remote host computing device 200 (shown in FIG. 1A). The host 200 may be, for example, a notebook PC, smart phone, tablet device, or other computing device having less or greater computational complexity than the wireless computing headset device 100, such as cloud-based network resources. The headset computing device 100 and host 200 can wirelessly communicate via one or more wireless protocols, such as Bluetooth®, Wi-Fi, WiMAX, 4G LTE or other wireless radio link 150. (Bluetooth is a registered trademark of Bluetooth Sig, Inc. of 5209 Lake Washington Boulevard, Kirkland, Wash. 98033).

In an example embodiment, the host 200 may be further connected to other networks, such as through a wireless connection to the Internet or other cloud-based network resources, so that the host 200 can act as a wireless relay between the HSC 100 and the network 210. Alternatively, some embodiments of the HSC 100 can establish a wireless connection to the Internet (or other cloud-based network resources) directly, without the use of a host wireless relay. In such embodiments, components of the HSC 100 and the host 200 may be combined into a single device.

FIG. 1B is a perspective view showing some details of an example embodiment of a headset computer 100. The example embodiment HSC 100 generally includes a frame 1000, strap 1002, rear housing 1004, speaker 1006, cantilever (alternatively referred to as an arm or boom) 1008 with a built-in microphone, and a micro-display subassembly 1010.

A head worn frame 1000 and strap 1002 are generally configured so that a user can wear the headset computer device 100 on the user's head. A housing 1004 is generally a low profile unit which houses the electronics, such as the microprocessor, memory or other storage device, along with other associated circuitry. Speakers 1006 provide audio output to the user so that the user can hear information. Micro-display subassembly 1010 is used to render visual information to the user. It is coupled to the arm 1008. The arm 1008 generally provides physical support such that the micro-display subassembly is able to be positioned within the user's field of view 300 (FIG. 1A), preferably in front of the eye of the user or within its peripheral vision preferably slightly below or above the eye. Arm 1008 also provides the electrical or optical connections between the micro-display subassembly 1010 and the control circuitry housed within housing unit 1004.

According to aspects that will be explained in more detail below, the HSC display device 100 allows a user to select a field of view 300 within a much larger area defined by a virtual display 400. The user can typically control the position, extent (e.g., X-Y or 3D range), and/or magnification of the field of view 300.

While what is shown in FIGS. 1A and 1B is a monocular micro-display presenting a single fixed display element supported on the face of the user with a cantilevered boom, it should be understood that other mechanical configurations for the remote control display device 100 are possible, such as a binocular display with two separate micro-displays (e.g., one for each eye) or a single micro-display arranged to be viewable by both eyes.

FIG. 2 is a block diagram showing more detail of an embodiment of the HSC or HMD device 100, host 200 and the data that travels between them. The HSC or HMD device 100 receives vocal input from the user via the microphone, hand movements or body gestures via positional and orientation sensors, the camera or optical sensor(s), and head movement inputs via the head tracking circuitry such as 3 axis to 9 axis degrees of freedom orientational sensing. These are translated by software (processors) in the HSC or HMD device 100 into keyboard and/or mouse commands that are then sent over the Bluetooth or other wireless interface 150 to the host 200. The host 200 then interprets these translated commands in accordance with its own operating system/application software to perform various functions. Among the commands is one to select a field of view 300 within the virtual display 400 and return that selected screen data to the HSC or HMD device 100. Thus, it should be understood that a very large format virtual display area might be associated with application software or an operating system running on the host 200. However, only a portion of that large virtual display area 400 within the field of view 300 is returned to and actually displayed by the micro display 1010 of HSC or HMD device 100.
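As a non-limiting sketch of how a host might select the field of view 300 within the virtual display 400, the following computes a clamped rectangle from a requested center and magnification. The function name, parameters, pixel units and clamping behavior are assumptions made for illustration; the specification does not prescribe them.

```python
def select_field_of_view(virtual_w, virtual_h, center_x, center_y,
                         view_w, view_h, magnification=1.0):
    """Return (left, top, width, height) of the region of the virtual
    display that would be sent back to the headset.

    Hypothetical helper: higher magnification selects a smaller region,
    and the region is clamped so it never leaves the virtual display.
    """
    if magnification <= 0:
        raise ValueError("magnification must be positive")
    width = min(virtual_w, view_w / magnification)
    height = min(virtual_h, view_h / magnification)
    # Center the region on the requested point, then clamp to the bounds.
    left = min(max(center_x - width / 2, 0), virtual_w - width)
    top = min(max(center_y - height / 2, 0), virtual_h - height)
    return (left, top, width, height)


# An XGA-sized (1024x768) view centered in an assumed 4096x3072 virtual display.
print(select_field_of_view(4096, 3072, 2048, 1536, 1024, 768))
```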

In one embodiment, the HSC 100 may take the form of the device described in a co-pending US Patent Publication Number 2011/0187640, which is hereby incorporated by reference in its entirety.

In another embodiment, the invention relates to the concept of using a Head Mounted Display (HMD) 1010 in conjunction with an external ‘smart’ device 200 (such as a smartphone or tablet) to provide information and control to the user hands-free. The invention requires transmission of small amounts of data, providing a more reliable data transfer method running in real-time.

In this sense therefore, the amount of data to be transmitted over the connection 150 is small: simply instructions on how to lay out a screen, which text to display, and other stylistic information such as drawing arrows, background colors, or images to include.

Additional data could be streamed over the same connection 150 or another connection and displayed on screen 1010, such as a video stream if required by the host 200.

FIG. 3 shows an example embodiment of a wireless hands-free video computing headset 100 under voice command, according to one embodiment of the present invention. The user may be presented with an image on the micro-display 9010, for example, as output by host computer 200 mentioned above. A user of the HMD 100 can additionally use voice location software module 9036, either locally or from a remote host 200, in which the user is presented with an image of a message box, text box or dialogue box prompting user input on the microdisplay 9010 and the audio of the same through the speaker 9006 of the headset computer 100. Because the headset computer 100 is also equipped with a microphone 9020, the user can utter the command words or phrase (command selection) as illustrated next with respect to embodiments of the present invention.

FIG. 3 includes a schematic diagram of the operative modules of the headset computer 100. For the case of voice location in speech driven applications, controller 9100 accesses voice location module 9036, which can be located locally to each HMD 100 or located remotely at a host 200 (FIGS. 1A-1B).

Voice location software module 9036 contains instructions to display to a user an image of a pertinent message box or the like. The graphics converter module 9040 receives the image instructions from the voice location module 9036 via bus 9103 and converts the instructions into graphics to display on the monocular display 9010.

At the same time, text-to-speech module 9035 b converts instructions received from voice location software module 9036 to create sounds representing the contents for the image to be displayed. The instructions are converted into digital sounds representing the corresponding image contents that the text-to-speech module 9035 b feeds to the digital-to-analog converter 9021 b, which in turn feeds speaker 9006 to present the audio to the user. Voice location software module 9036 can be stored locally at memory 9120 or remotely at a host 200 (FIG. 1A).

The user can speak/utter the command selection from the image and the user's speech 9090 is received at microphone 9020. The received speech is then converted from an analog signal into a digital signal at analog-to-digital converter 9021 a. Once the speech is converted from an analog to a digital signal, speech recognition module 9035 a processes the speech into recognized speech. The recognized speech is compared against known speech according to the instructions of the voice location module 9036.

A voice driven location system may be used, for example, in Explicit Location mode or Passive Location mode.

For Explicit Location (EL) mode, a device 100/module 9036 could explicitly ask the user to describe his location, for example by prompting the user with a list of known way-points. In one embodiment, HSC 100 presents a map of a region (e.g., a room or floor plan within a building) to the user, with certain items highlighted, such as “front-door,” “back-door,” “window,” “microwave oven,” and so on. The user then speaks a command, such as “I am standing next to the microwave oven.” Based on such information the HSC 100/module 9036 looks up position information associated with the microwave oven, from a pre-populated database (e.g., from an absolute scale floor plan, a schematic, a map and/or GPS coordinates of the microwave oven), and uses the spoken command to reset the mobile user's device location, thereby creating an absolute fix. The pre-populated database may be stored, for example, in memory 9120.
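The Explicit Location flow described above may be sketched, purely by way of example, as follows. The way-point coordinates, the prompt wording, the function names and the simple substring matching are hypothetical stand-ins; an actual embodiment would operate on the output of the speech recognition module and on the pre-populated database.

```python
# Hypothetical pre-populated database: way-point name -> (x, y, z) in feet.
WAYPOINTS = {
    "front-door": (0.0, 0.0, 0.0),
    "back-door": (30.0, 0.0, 0.0),
    "window": (15.0, 10.0, 0.0),
    "microwave oven": (4.0, 15.0, 0.0),
}


def build_prompt(waypoints):
    """Text of the solicitation shown on the microdisplay and/or spoken."""
    return "Where are you? Known way-points: " + ", ".join(sorted(waypoints))


def absolute_fix(recognized_reply, waypoints):
    """Return the coordinates of the single way-point named in the reply,
    or None when no way-point (or more than one) is mentioned."""
    text = recognized_reply.lower().replace("-", " ")
    matches = [name for name in waypoints
               if name.replace("-", " ") in text]
    if len(matches) == 1:
        return waypoints[matches[0]]
    return None


print(build_prompt(WAYPOINTS))
print(absolute_fix("I am standing next to the microwave oven", WAYPOINTS))
```

Returning None for an ambiguous or unmatched reply leaves the previous location estimate in place, which is one possible design choice rather than a requirement.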

Once the device location has been reset, dead reckoning or inertial navigation (e.g., accelerometer-based navigation) may take place until sufficient drift requires another absolute fix. Sufficient drift may be determined by comparing the actual drift to a predetermined threshold. The threshold may be preset, or it may be programmed by the user.
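One possible way to decide when another absolute fix is needed is sketched below. Because true drift cannot be observed directly, this sketch assumes the drift is estimated as a fixed fraction of the distance travelled since the last fix; the class name, the 10% rate and the 5-foot threshold are illustrative assumptions only.

```python
import math
from dataclasses import dataclass


@dataclass
class DeadReckoningTracker:
    """Illustrative tracker: integrates displacement steps and estimates
    accumulated drift as a fraction of distance since the last fix."""
    drift_rate: float = 0.10      # assumed worst-case fractional drift
    threshold_ft: float = 5.0     # hypothetical preset or user-set threshold
    x_ft: float = 0.0
    y_ft: float = 0.0
    distance_since_fix_ft: float = 0.0

    def reset_to_fix(self, x_ft, y_ft):
        """Apply an absolute fix (e.g., from a voiced location)."""
        self.x_ft, self.y_ft = x_ft, y_ft
        self.distance_since_fix_ft = 0.0

    def step(self, dx_ft, dy_ft):
        """Apply one inertially measured displacement."""
        self.x_ft += dx_ft
        self.y_ft += dy_ft
        self.distance_since_fix_ft += math.hypot(dx_ft, dy_ft)

    def estimated_drift_ft(self):
        return self.drift_rate * self.distance_since_fix_ft

    def needs_absolute_fix(self):
        return self.estimated_drift_ft() > self.threshold_ft


tracker = DeadReckoningTracker()
tracker.reset_to_fix(10.0, 20.0)
tracker.step(0.0, 40.0)
print(tracker.needs_absolute_fix())
```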

The scale floor plan, schematic, map and/or GPS coordinates of key items in and around a room or floor-plan may be known ahead of time, or may be calculable in real time. To be calculable in real time, the system 100 may employ both an accurate distribution floor-plan, and the GPS coordinates of certain aspects of (e.g., the corners of) the floor-plan. HSC 100 may then utilize interpolation, based on the relative item locations and the GPS coordinates (or other absolute coordinates) to calculate the position of all items within the floor-plan.
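The interpolation mentioned above could, for example, be bilinear interpolation between the absolute coordinates of the four corners of the floor-plan. The sketch below assumes a roughly rectangular floor-plan small enough that latitude and longitude vary linearly across it; the corner labels, the (u, v) fractional coordinates and the sample values are assumptions for illustration.

```python
def interpolate_position(corners, u, v):
    """Bilinearly interpolate an item's absolute (lat, lon).

    corners: dict with keys 'sw', 'se', 'nw', 'ne' mapping to (lat, lon).
    u: fraction of the way from the west edge to the east edge (0..1).
    v: fraction of the way from the south edge to the north edge (0..1).
    """
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        raise ValueError("u and v must lie within the floor-plan (0..1)")

    def lerp(a, b, t):
        return a + (b - a) * t

    south = [lerp(a, b, u) for a, b in zip(corners["sw"], corners["se"])]
    north = [lerp(a, b, u) for a, b in zip(corners["nw"], corners["ne"])]
    return tuple(lerp(s, n, v) for s, n in zip(south, north))


# Made-up corner coordinates, used only to exercise the function.
corners = {
    "sw": (42.0, -71.0), "se": (42.0, -70.0),
    "nw": (43.0, -71.0), "ne": (43.0, -70.0),
}
print(interpolate_position(corners, 0.5, 0.5))
```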

For Passive Location (PL) mode, the system 100 may be expanded to work passively, i.e., in the background. PL mode is based on the idea that a person, during the course of ordinary activities and conversations, may provide conversational clues as to his location. In PL mode, the HSC 100 monitors any ongoing utterances being made by the user, and extracts relevant utterances, rather than explicitly soliciting such utterances. Keyword spotting and natural language processing by HSC 100 are activated during PL mode, so that the HSC 100 may search for and find phrases such as “I'm coming up the stairs now,” or “Hurry up, I'm waiting for you by the elevator,” and similar phrases associated with location. Module 9036 may analyze these phrases and extract keywords such as “stairs” and “elevator,” and other location reference words. Module 9036 cross-references the extracted location words with a reference plan that describes known objects in the current floor plan, schematic or map to transpose the extracted location words to an accurate floor plan, schematic, map and/or GPS locations. The reference plan may reside in a database or other such structure for storing an organized collection of data, stored, for example, in memory 9120. HSC 100 may match an extracted location word to a known object from the reference plan, and use coordinates of the known object from the reference plan to determine the user's/device location (i.e., the location of the HSC). The HSC 100 may reset device location according to those coordinates, as previously described.
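A minimal sketch of the Passive Location flow follows. It substitutes simple whole-word matching for the keyword spotting and natural language processing described above, and the reference plan contents, coordinates and function names are hypothetical.

```python
import re

# Hypothetical reference plan: known object -> (x, y, z) coordinates.
REFERENCE_PLAN = {
    "stairs": (12.0, 3.5, 0.0),
    "elevator": (20.0, 8.0, 0.0),
    "microwave oven": (4.0, 15.0, 0.0),
}


def extract_location_words(utterance, plan):
    """Return the known-object names that appear as whole words in the
    recognized utterance (a stand-in for keyword spotting)."""
    text = utterance.lower()
    return [name for name in plan
            if re.search(r"\b" + re.escape(name) + r"\b", text)]


def passive_location(utterances, plan):
    """Scan a stream of recognized utterances in order and return the
    coordinates of the most recent unambiguous location clue, or None."""
    fix = None
    for utterance in utterances:
        words = extract_location_words(utterance, plan)
        if len(words) == 1:          # skip utterances with no or mixed clues
            fix = plan[words[0]]
    return fix


overheard = [
    "What time is lunch?",
    "I'm coming up the stairs now",
    "Hurry up, I'm waiting for you by the elevator",
]
print(passive_location(overheard, REFERENCE_PLAN))
```

Ignoring utterances that mention more than one known object is a conservative choice made for this sketch; an embodiment could instead rank candidates using additional context.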

FIG. 4 illustrates an embodiment of a method of enhancing a location system according to the invention, including recognizing 402 a user utterance as a description of an object disposed within a region, identifying 404 the object within a map of the region, extracting 406 locational coordinates associated with the object from the map of the region, and establishing 408 a location of the device based on the locational coordinates.

It will be apparent that one or more embodiments described herein may be implemented in many different forms of software and hardware. Software code and/or specialized hardware used to implement embodiments described herein is not limiting of the embodiments of the invention described herein. Thus, the operation and behavior of embodiments are described without reference to specific software code and/or specialized hardware, it being understood that one would be able to design software and/or hardware to implement the embodiments based on the description herein.

Further, certain embodiments of the example embodiments described herein may be implemented as logic that performs one or more functions. This logic may be hardware-based, software-based, or a combination of hardware-based and software-based. Some or all of the logic may be stored on one or more tangible, non-transitory, computer-readable storage media and may include computer-executable instructions that may be executed by a controller or processor. The computer-executable instructions may include instructions that implement one or more embodiments of the invention. The tangible, non-transitory, computer-readable storage media may be volatile or non-volatile and may include, for example, flash memories, dynamic memories, removable disks, and non-removable disks.

While this invention has been particularly shown and described with references to example embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

What is claimed is:
 1. A headset computer device comprising: a microdisplay driven by a processor; a microphone coupled to provide user-voiced input to the processor; and a voice location module, executed by the processor, configured to (i) extract one or more location-related phrases from an ongoing user utterance that was produced during ordinary conversation, conveyed through the user-voiced input; (ii) analyze the one or more location-related phrases to identify at least one location-related reference word; and (iii) establish a location of the device based on the at least one location-related reference word.
 2. The headset computer device of claim 1, wherein the processor resets the established location based on the user-voiced input.
 3. The headset computer device of claim 1, wherein the voice location module is further configured to establish the location of the device within a three dimensional coordinate system.
 4. The headset computer device of claim 1, wherein the user-voiced input is in response to a solicitation from the headset computer device.
 5. The headset computer device of claim 1, wherein the headset computer device is further configured to extract the user-voiced input from unsolicited utterances.
 6. The headset computer device of claim 1, wherein the user-voiced input includes information describing proximity to an object.
 7. The headset computer device of claim 6, wherein the headset computer device is further configured to cross-reference the information describing proximity to an object with a reference plan of known objects.
 8. The headset computer device of claim 6, wherein the headset computer device is further configured to identify a match between the information describing proximity to an object and a known object described in the reference plan.
 9. The headset computer device of claim 1, wherein the headset computer is further configured to use inertial navigation to determine a subsequent location of the device, after the voice location module establishes the location of the device.
 10. A method of enhancing a location system, comprising: by a digital processing device, (i) extracting one or more location-related phrases from an ongoing user utterance that was produced during ordinary conversation; (ii) analyzing the one or more location-related phrases to identify at least one location-related reference word; (iii) recognizing the location-related reference word as a description of an object disposed within a region; (iv) identifying the object within a map of the region; (v) extracting locational coordinates associated with the object from the map of the region; and (vi) establishing a location of the device based on the locational coordinates.
 11. The method of claim 10, further including resetting the established location based on the user-voiced input.
 12. The method of claim 10, further including establishing the location of the device within a three dimensional coordinate system.
 13. The method of claim 10, wherein the description of the object includes proximity to an object.
 14. The method of claim 10, wherein the user-voiced input is in response to a solicitation from the headset computer device.
 15. The method of claim 10, wherein the headset computer device extracts the user-voiced input from unsolicited utterances.
 16. The method of claim 10, wherein the user-voiced input includes information describing proximity to an object.
 17. The method of claim 16, further including cross-referencing the information describing proximity to an object with a reference plan of known objects.
 18. The method of claim 16, further including identifying a match between the information describing proximity to an object and a known object described in the reference plan.
 19. The method of claim 10, further including using inertial navigation to determine a subsequent location of the device, after the voice location module establishes the location of the device.
 20. A non-transitory computer-readable medium for recognizing speech, the non-transitory computer-readable medium comprising computer software instructions stored thereon, the computer software instructions, when executed by at least one processor, cause a computer system to: (i) extract one or more location-related phrases from an ongoing user utterance that was produced during ordinary conversation; (ii) analyze the one or more location-related phrases to identify at least one location-related reference word; (iii) recognize the location-related reference word as a description of an object disposed within a region; (iv) identify the object within a map of the region; (v) extract locational coordinates associated with the object from the map of the region; and (vi) establish a location of a user who produced the user utterance based on the locational coordinates.